Efficient Large Scale Continuous Selection-Join Queries Based on Multidimensional Index
Abstract
We consider the problem of processing a large number of continuous selection-join queries over data streams. As far as we know, current data stream management systems filter events according to query plans created from the continuous queries that users define. Although many optimizations of query plans have been proposed, few proposals address the processing of continuous selection-join queries. Query plan-based processing of such queries schedules the execution of operators and manages the buffers that hold intermediate results for the selection and join operators. As the number of continuous selection-join queries grows, plan-based processing becomes inefficient for two reasons: 1) the system has to schedule a large number of operators, and 2) memory is consumed quickly by the many buffers those operators use. In this paper, we introduce an efficient event filtering algorithm based on a multidimensional index structure, in which event filtering is treated as a query in multidimensional space. Because our proposal is not based on query plans, no operator scheduling is needed. In addition, only one global buffer is kept for the join operation, so a much larger sliding window can be defined. We first introduce an event filtering model for continuous selection-join queries. Based on this model, we propose an optimized event filtering algorithm that filters the stream of join results efficiently. The evaluation results show that event filtering for a large number of continuous selection-join queries based on our model scales well with respect to both the number of queries and the number of dimensions. The proposed algorithm improves event filtering performance significantly, especially for continuous selection-join queries with large sliding windows.
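The idea of treating event filtering as a query in multidimensional space can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not name the index structure, so a uniform grid stands in for it here, and the class names `GridQueryIndex` and `SharedWindowJoin`, the equi-join on a key, and the count-based window are all assumptions made for illustration. Each continuous query is registered as a hyper-rectangle of selection ranges, each join result is a point, and filtering is a point lookup against the registered rectangles.

```python
from collections import defaultdict, deque
from itertools import product


class GridQueryIndex:
    """Hypothetical uniform grid over query hyper-rectangles.

    A stand-in for a real multidimensional index; the abstract does not
    say which structure the authors use.
    """

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # grid cell -> ids of overlapping queries
        self.boxes = {}                 # query id -> (lows, highs)

    def _cell(self, value):
        return int(value // self.cell_size)

    def add_query(self, qid, lows, highs):
        """Register a query whose selection ranges are [lows[i], highs[i]]."""
        self.boxes[qid] = (lows, highs)
        spans = [range(self._cell(lo), self._cell(hi) + 1)
                 for lo, hi in zip(lows, highs)]
        for cell in product(*spans):
            self.cells[cell].append(qid)

    def match(self, point):
        """Return the sorted ids of all queries whose rectangle contains point."""
        cell = tuple(self._cell(v) for v in point)
        hits = []
        for qid in self.cells.get(cell, []):
            lows, highs = self.boxes[qid]
            if all(lo <= v <= hi for lo, v, hi in zip(lows, point, highs)):
                hits.append(qid)
        return sorted(hits)


class SharedWindowJoin:
    """One global count-based window shared by every registered query."""

    def __init__(self, index, window_size):
        self.index = index
        self.window = deque(maxlen=window_size)  # oldest events drop off

    def on_event(self, stream, key, value):
        """Join the new event with buffered events of the other stream,
        then filter each join result through the query index."""
        results = []
        for other_stream, other_key, other_value in self.window:
            if other_stream != stream and other_key == key:
                # Point layout is (R attribute, S attribute).
                if stream == "R":
                    point = (value, other_value)
                else:
                    point = (other_value, value)
                results.append((point, self.index.match(point)))
        self.window.append((stream, key, value))
        return results
```

In this sketch the cost of filtering one join result depends on how many queries overlap the point's grid cell rather than on the total number of registered queries, and no per-query buffer or operator exists to schedule. A real deployment would likely replace the grid with a tree-based index when query rectangles vary widely in size.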
Related resources
Communication-Efficient Implementation of Range-Joins in Sensor Networks
Sensor networks are multi-hop wireless networks of resource constrained sensor nodes used to realize high-level collaborative sensing tasks. To query and access data generated and stored at the sensor nodes, the sensor network can be looked upon as a distributed database. The unique characteristics of sensor networks such as limited memory and energy resources at each node make efficient execut...
Towards Cost-based Optimizations of Twig Content-based Queries
In recent years, many approaches to indexing XML data have appeared. These approaches attempt to process XML queries efficiently and sufficient query plans are built for this purpose. Some effort has been expended in the optimization of XML query processing [20]. There are not many works that take cost-based query optimizations into account. In work [20], we find some cost-based considerations,...
An OLAP Tool Based on the Bitmap Join Index
Data warehouses and OLAP are core aspects of business intelligence environments, since the former store integrated and time-variant data, while the latter enables multidimensional queries, visualization and analysis. The bitmap join index has been recognized as an efficient mechanism to speed up queries over data warehouses. However, existing OLAP tools do not strictly use this index to improv...
H2RDF+: High-performance distributed joins over large-scale RDF graphs
The proliferation of data in RDF format calls for efficient and scalable solutions for their management. While scalability in the era of big data is a hard requirement, modern systems fail to adapt based on the complexity of the query. Current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this work...
Parallel Selection Query Processing Involving Index in Parallel Database Systems
Index is an important element in databases, and the existence of index is unavoidable. When an index has been built on a particular attribute, database operations (e.g. selection, join) on this attribute will become more efficient by utilizing the index. In this paper we focus on parallel algorithms for selection queries involving index – that is data searching on indexed attributes. In this pa...
Publication date: 2006